Honor workflow-level Codex tuning in workflows by matzls · Pull Request #1215 · coleam00/Archon

matzls · 2026-04-14T11:59:02Z

Summary

- make workflow-level Codex tuning effective at runtime for normal and loop nodes
- add regression coverage for Codex override, fallback, and mixed-provider loop preservation
- align Codex Archon docs and docs-web guidance with the implemented precedence

Validation

- `bun test src/dag-executor.test.ts`
- `bun x tsc --noEmit`
- `git diff --check`

Notes

- workflow-level precedence is now: workflow YAML -> assistants.codex config -> SDK defaults
- peer-reviewed with the Codex engine after implementation and after the final regression/doc follow-up

Summary by CodeRabbit

New Features
- Added Codex-tuned assist workflow (archon-assist-codex) for general AI help and exploration
- Introduced interactive human-in-the-loop Plan-Implement-Validate development workflow with user approval gates
- Added automatic project detection to identify build tools and validation commands
- Workflows now adapt to selected AI assistant type (Claude vs Codex)
Documentation
- Comprehensive guides for workflow configuration, monitoring, debugging, and interactive operation
- CLI reference and assistant architecture documentation
- Troubleshooting and setup guidance

Chat platform adapters (Telegram, Slack, Discord) in @archon/adapters are pure transport and cannot call messageDb directly. Until now, only the Web adapter's PersistenceBuffer and the HTTP routes persisted messages, leaving telegram conversations with rows in remote_agent_conversations but zero rows in remote_agent_messages. The Web UI then rendered these conversations as empty.

Add four persistence hooks inside handleMessage, gated strictly on platform.getPlatformType() === 'telegram' so the web path is completely untouched:

1. User message persistence after conversation creation + title generation but BEFORE the natural-language approval gate, so approval responses are captured. !message.startsWith('/') excludes deterministic slash commands.
2. Stream-mode assistant persistence after parseOrchestratorCommands, inside the "no retract" branch, so retracted /invoke-workflow text is never saved (matches Web's MessagePersistence.retractLastSegment semantics).
3. Batch-mode assistant persistence after platform.sendMessage succeeds, with the same retract guard.
4. Top-level catch persistence for error responses, so orphan user rows without assistant counterparts can't appear in the Web UI view.

The conversation variable is hoisted out of the inner try block so the catch handler can reference it. All persistence errors are logged and swallowed: a DB hiccup must not break the user-facing Telegram reply.

Gate is strict 'telegram' for MVP. Broadening to Slack/Discord/GitHub will require auditing those adapters' webhook replay behavior first.

Known MVP limitations (will file as follow-ups):
- Tool-call metadata not captured for telegram (web buffer still owns that)
- Workflow dispatch progress messages from dag-executor not captured
- Non-deterministic slash commands also excluded by the coarse startsWith('/') gate (acceptable: chat clients don't send ad-hoc slash commands)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
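The gate and the log-and-swallow behaviour described in this commit message could look roughly like the sketch below. Only the `'telegram'` platform check and the `startsWith('/')` exclusion come from the commit message; `MessageStore`, `shouldPersistUserMessage`, and `persistSafely` are hypothetical names, and the real messageDb API is not shown on this page.

```typescript
// Hypothetical store interface standing in for messageDb (assumption).
interface MessageStore {
  addMessage(
    conversationId: string,
    role: "user" | "assistant",
    content: string,
  ): Promise<void>;
}

// Gate from the commit message: telegram only, and no slash-prefixed input.
function shouldPersistUserMessage(platformType: string, message: string): boolean {
  return platformType === "telegram" && !message.startsWith("/");
}

// Persist a message; log and swallow any failure so a database error
// never prevents the reply from reaching the user.
async function persistSafely(
  store: MessageStore,
  conversationId: string,
  role: "user" | "assistant",
  content: string,
  logError: (err: unknown) => void = console.error,
): Promise<boolean> {
  try {
    await store.addMessage(conversationId, role, content);
    return true;
  } catch (err) {
    logError(err);
    return false;
  }
}
```

Returning a boolean instead of rethrowing keeps the caller's control flow identical whether or not the write succeeded, which is the property the commit message asks for.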

The Web UI disables the message input for any conversation whose platform_type is not 'web' because Scope B (bidirectional bridging) isn't shipped yet. The old disabledReason string ("Continuing chats from other platforms in the Web UI is coming soon") was both vague and increasingly misleading now that Telegram conversations render their full history in the Web UI.

Replace the hardcoded string with a platform-keyed lookup map so each platform gets a clear "reply from the originating app" hint. The disable condition itself is unchanged; this is pure copy + a small constant. Only telegram is functionally wired up (persistence hooks land in a sibling commit); slack/discord/github entries are forward-compatible and take effect as soon as persistence is broadened to those platforms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
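A lookup map of this kind might be sketched as follows. The platform keys (telegram, slack, discord, github) come from the commit message; the constant name, the function name, and every hint string are illustrative assumptions, since the page does not quote the final copy.

```typescript
// Hint wording is illustrative; the actual strings are not quoted in this PR page.
const REPLY_HINT_BY_PLATFORM: Record<string, string> = {
  telegram: "Reply from Telegram to continue this conversation.",
  slack: "Reply from Slack to continue this conversation.",
  discord: "Reply from Discord to continue this conversation.",
  github: "Reply on GitHub to continue this conversation.",
};

// Returns undefined for web conversations (input stays enabled) and a
// platform-specific hint otherwise, with a generic fallback for unknown keys.
function getDisabledReason(platformType: string): string | undefined {
  if (platformType === "web") return undefined;
  return (
    REPLY_HINT_BY_PLATFORM[platformType] ??
    "Reply from the originating app to continue this conversation."
  );
}
```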

Add a focused test suite for the telegram persistence gate added to handleMessage. Covers the three load-bearing cases on the user-message side of the gate:

1. Natural-language telegram messages persist exactly one row with role 'user', the DB conversation id, the raw message text, and metadata { platformType: 'telegram' }.
2. Deterministic slash commands (/help) skip persistence entirely: neither user nor assistant rows are created.
3. Web-platform conversations do NOT trigger the centralized path, so web's existing MessagePersistence buffer still owns that flow.

Assistant-message persistence hooks (inside handleStreamMode, handleBatchMode, and the top-level catch) require mocking sendQuery to yield actual content, which needs a more elaborate mock setup than the existing test file provides. Tracking that as a follow-up rather than blocking the MVP on it; the user-persistence path is the primary new logic and is covered here.

A new mock.module('../db/messages', ...) is added near the existing DB mocks so that orchestrator-agent.ts's new messageDb import does not try to open a real DB connection. orchestrator-agent.test.ts runs in its own bun test invocation per packages/core/package.json, so the new mock does not pollute sibling orchestrator tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

This repo is a clone of coleam00/Archon and will evolve upstream as the project moves through beta. To keep local customizations persistent across upstream releases without merge chaos, the working copy is set up fork-first: origin → matzls/Archon (push access), upstream → coleam00/Archon (read-only).

Add a "Fork & Upstream Integration" section to CLAUDE.md next to the existing Git Workflow content so future sessions have a single grounded reference for:

- Which remote is which and what dev tracks
- Where different customization types belong (personal config vs. upstreamable code changes vs. personal code changes)
- The exact commands to integrate upstream releases (fetch + ff merge + rebase feature branches)
- The exact commands to contribute back via gh pr create

Intentionally short: this is routing guidance, not a git tutorial.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

9-node DAG PIV loop tuned for Codex behavioral tendencies: numbered SIGNAL EMISSION CONTRACTs, task-scoped implement loop (no repo-wide validators mid-task), pre-existing failure tolerance in code-review, per-file git staging, and tightened COMPLETE signal. Validated end-to-end on my-second-brain-build (Python/Obsidian vault) with 32 pre-existing ruff violations — workflow correctly scoped fixes to branch-introduced issues only. Also add root-level artifacts/ to .gitignore (workflow runtime output). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Reuse the deterministic slash-command allowlist for Telegram user-message persistence so slash-prefixed AI prompts are stored while ephemeral commands still skip persistence. Add a regression test covering /etc/hosts and stabilize the command-parser mock in the Telegram persistence test block. Co-authored-by: Codex <noreply@openai.com>
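The difference from the earlier coarse `startsWith('/')` gate can be sketched as below: only messages whose first token names a known deterministic command are skipped, so a prompt such as `/etc/hosts` is still stored. The allowlist contents and both function names are hypothetical; only `/help` and the `/etc/hosts` regression case appear in the commit messages.

```typescript
// Hypothetical allowlist; the real one lives in Archon's command parser.
const DETERMINISTIC_COMMANDS = new Set(["help", "status", "reset"]);

// True only when the message starts with "/" AND its first token
// (without the slash) is a known deterministic command.
function isDeterministicCommand(message: string): boolean {
  if (!message.startsWith("/")) return false;
  const name = message.slice(1).split(/\s+/, 1)[0] ?? "";
  return DETERMINISTIC_COMMANDS.has(name);
}

// Slash-prefixed AI prompts such as "/etc/hosts looks wrong" are persisted;
// ephemeral commands such as "/help" are not.
function shouldPersistTelegramUserMessage(message: string): boolean {
  return !isDeterministicCommand(message);
}
```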

Context: preserve and land the Codex-specific assist workflow onto current dev while keeping the newer telegram persistence behavior already present on dev.

Change:
- add the bundled archon-assist-codex command and workflow defaults plus the tracked Archon skill files they depend on
- default continue and orchestrator assist routing to archon-assist-codex when the assistant type is codex
- extend server, web, docs, and core test coverage for the new workflow and the assistant-aware prompt-builder signatures

Validation:
- bun test packages/cli/src/commands/continue.test.ts
- bun test packages/core/src/orchestrator/prompt-builder.test.ts
- bun test packages/core/src/orchestrator/orchestrator.test.ts
- bun test packages/server/src/routes/api.health.test.ts
- bun test packages/server/src/routes/api.workflows.test.ts
- bun test packages/web/src/lib/workflow-metadata.test.ts
- bun test packages/workflows/src/defaults/bundled-defaults.test.ts
- bun --filter @archon/cli type-check
- bun --filter @archon/core type-check
- bun --filter @archon/server type-check
- bun --filter @archon/workflows type-check
- bun --filter @archon/web type-check
- bun run validate

Codex-Session: 019d80c8-3cb7-79b1-8443-d09a42cb5020
Codex-Rollout: sessions/2026/04/12/rollout-2026-04-12T10-21-39-019d80c8-3cb7-79b1-8443-d09a42cb5020.jsonl
Co-authored-by: Codex <noreply@openai.com>
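The assistant-aware routing described here reduces to a small selector. The helper name `getAssistWorkflowName` and the two workflow names appear in the CodeRabbit walkthrough on this page; the signature and the `AssistantType` union below are assumptions made for illustration.

```typescript
// Assumed union of assistant types; the PR mentions Claude and Codex.
type AssistantType = "claude" | "codex";

// Pick the assist workflow for the active assistant (signature is an assumption).
function getAssistWorkflowName(assistantType: AssistantType): string {
  return assistantType === "codex" ? "archon-assist-codex" : "archon-assist";
}
```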

Extract the detect-project workflow node into a reusable Bun script while preserving its stdout contract. Also tighten the Codex loop prompts so feedback fixes use per-file staging, scoped validation stays tool-accurate, and iteration violations fail without rewriting history. Extend typed ESLint coverage to .archon/scripts so the new script participates in the existing pre-commit checks. Co-authored-by: Codex <noreply@openai.com>

Document how assistant selection works across host skills, conversations, workflows, and nodes. Capture fork-specific Codex additions, upstream differences, and the current Codex limitations for workflow nodes. Co-authored-by: Codex <noreply@openai.com>

Preview upstream sync onto custom dev; auto-merged cleanly and passed bun run validate. Co-authored-by: Codex <noreply@openai.com>

Document the phase-by-phase session model for archon-piv-loop-codex, including fresh-context boundaries, interactive loop resume behavior, and a flow diagram for future review. Co-authored-by: Codex <noreply@openai.com>

Promote archon-piv-loop-codex into the bundled default workflow set and update the overview docs to advertise it there. Also fix the execution-notes path references after the workflow moved under .archon/workflows/defaults. Co-authored-by: Codex <noreply@openai.com>

Exclude CLI-backed workflow runs from the server startup orphan-failure sweep and document the distinction in the workflow authoring guide. This keeps server restarts from marking active CLI executions failed while they continue in a separate process against the same database. Co-authored-by: Codex <noreply@openai.com>
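One way to picture the sweep change is as a filter over running rows, as in the sketch below. The `WorkflowRun` shape and its `owner` field are hypothetical; the page does not show how Archon actually marks a run as CLI-backed.

```typescript
// Hypothetical row shape; `owner` is an assumed marker for CLI-backed runs.
interface WorkflowRun {
  id: string;
  status: "running" | "completed" | "failed";
  owner: "server" | "cli";
}

// Select the runs a server restart may safely mark as failed: still
// "running", but not owned by a CLI process that may still be alive.
function selectOrphanedRuns(runs: WorkflowRun[]): WorkflowRun[] {
  return runs.filter((run) => run.status === "running" && run.owner !== "cli");
}
```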

Add a reusable reference for repeated Archon log-debugging sessions and surface it from the Codex assist lane and the top-level Archon skill routing table. The new guide explains the three log layers, run discovery, JSONL filtering, event interpretation, and when to use UI or raw logs. Co-authored-by: Codex <noreply@openai.com>

- bundle the detect-project helper as a default script and resolve Archon default scripts when repo-local scripts are absent
- stop PIV loop nodes early when git HEAD and task-progress tracking stop advancing
- fail workflow CLI commands early when ~/.archon is not writable and clarify the sandbox failure mode in docs
- persist richer DAG failure metadata for partial-run diagnostics

Co-authored-by: Codex <noreply@openai.com>
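The stuck-loop rule (stop when neither git HEAD nor the progress tracking advances) might be sketched as follows. The `ProgressSnapshot` shape and both function names are hypothetical; the threshold parameter mirrors the `stuck_after_no_progress_iterations` loop setting named in the CodeRabbit walkthrough, but how the executor really compares iterations is an assumption.

```typescript
// Hypothetical per-iteration snapshot: current git HEAD plus the
// contents (or a hash) of the loop's progress file.
interface ProgressSnapshot {
  gitHead: string;
  progress: string;
}

// Count how many of the most recent iterations left BOTH values unchanged
// relative to the iteration before them.
function trailingStaleIterations(history: ProgressSnapshot[]): number {
  let stale = 0;
  for (let i = history.length - 1; i > 0; i--) {
    const current = history[i];
    const previous = history[i - 1];
    if (current === undefined || previous === undefined) break;
    const advanced =
      current.gitHead !== previous.gitHead || current.progress !== previous.progress;
    if (advanced) break;
    stale++;
  }
  return stale;
}

// The loop is considered stuck once the stale streak reaches the threshold.
function isLoopStuck(history: ProgressSnapshot[], stuckAfterNoProgressIterations: number): boolean {
  return (
    stuckAfterNoProgressIterations > 0 &&
    trailingStaleIterations(history) >= stuckAfterNoProgressIterations
  );
}
```

Requiring both signals to stall avoids a false positive when an iteration commits without touching the progress file, or updates the progress file without committing.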

- add routing guidance for monitoring, interactive relays, and log debugging in the Archon skill
- add focused references for workflow monitoring cadence, paused-run relay behavior, and JSONL-first debugging
- keep Archon follow-up handling grounded in the run status and per-run logs

Co-authored-by: Codex <noreply@openai.com>

Add the generated PRD for workflow node display names to the Archon repo under docs/prd. The document keeps one PRD with a small execution-graph-only phase 1 and defers builder, non-graph execution surfaces, inference, and historical-fidelity questions to phase 2. Co-authored-by: Codex <noreply@openai.com>

Add the fork-level design doc that defines the Codex-first workflow surface, decision rules, and follow-on implementation sequence. Co-authored-by: Codex <noreply@openai.com>

Refine the Archon Codex skill and assist command so substantial implementation work routes to the Codex PIV lane, and add explicit worktree-proof/readback guardrails for assist-mode edits. Co-authored-by: Codex <noreply@openai.com>

Add a full Codex-first operator and authoring surface for the Archon skill, including workflow monitoring, debugging, repo init, command authoring, DAG authoring, CLI references, configuration guidance, and a Codex capability crosswalk. Correct the documented Codex parity boundaries so loop model/provider overrides are described accurately and workflow-level Codex tuning fields are called out as parsed but not runtime-effective per workflow in the current executor.

Validation:
- git diff --check
- archon workflow list --json

Co-authored-by: Codex <noreply@openai.com>

Make modelReasoningEffort, webSearchMode, and additionalDirectories effective from workflow YAML for Codex execution, with config fallback for normal and loop nodes. Add regression coverage for override, fallback, and mixed-provider loop preservation. Update Codex Archon references to match the implemented precedence. Co-authored-by: Codex <noreply@openai.com>
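The precedence stated in the PR notes (workflow YAML, then assistants.codex config, then SDK defaults) could be resolved per field roughly as shown below. The three field names come from this commit message; the `CodexTuning` interface, the value types, and `resolveCodexTuning` are assumptions, and the actual executor code is not shown on this page.

```typescript
// Field names from the commit message; the value types are assumptions.
interface CodexTuning {
  modelReasoningEffort?: string;
  webSearchMode?: string;
  additionalDirectories?: string[];
}

// Workflow-level values win; otherwise fall back to config. Fields left
// unset in both are omitted so the SDK applies its own defaults.
function resolveCodexTuning(workflow: CodexTuning, config: CodexTuning): CodexTuning {
  const resolved: CodexTuning = {};

  const effort = workflow.modelReasoningEffort ?? config.modelReasoningEffort;
  if (effort !== undefined) resolved.modelReasoningEffort = effort;

  const search = workflow.webSearchMode ?? config.webSearchMode;
  if (search !== undefined) resolved.webSearchMode = search;

  const dirs = workflow.additionalDirectories ?? config.additionalDirectories;
  if (dirs !== undefined) resolved.additionalDirectories = dirs;

  return resolved;
}
```

In this sketch `additionalDirectories` is replaced wholesale rather than merged; whether Archon merges or replaces the list is not stated in the source.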

Update the assistant architecture reference to reflect that workflow-level Codex tuning fields now override Archon config with config fallback, matching the shipped runtime behavior. Co-authored-by: Codex <noreply@openai.com>

coderabbitai · 2026-04-14T12:00:25Z

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

This pull request introduces a comprehensive Codex-first workflow infrastructure, including two new bundled Codex-tuned workflows (archon-assist-codex, archon-piv-loop-codex), extensive reference documentation for workflow authoring and operation, support for bundled scripts with Codex execution, loop progress tracking with stuck-loop detection, and dynamic workflow selection based on assistant type in the CLI and orchestrator.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Codex Skills & Documentation `.agents/skills/archon/SKILL.md`, `.agents/skills/archon/agents/openai.yaml` | Added Archon skill definition with implicit invocation support and display configuration for Codex routing surface. |
| Codex Reference Guides `.agents/skills/archon/references/archoring-commands.md`, `cli-commands.md`, `codex-capability-crosswalk.md`, `configuration.md`, `interactive-workflows.md`, `log-debugging.md`, `monitoring.md`, `repo-init.md`, `variables.md`, `workflow-dag.md` | Comprehensive reference documentation covering workflow authoring, CLI operations, Codex capability mapping, configuration precedence, interactive loop control, debugging, monitoring, repo initialization, variable substitution, and DAG structure. |
| Codex Examples `.agents/skills/archon/examples/command-template.md`, `dag-workflow.yaml` | Command and workflow templates demonstrating Load/Execute/Report phases and DAG node patterns for Codex workflows. |
| Claude Skills `.claude/skills/archon/SKILL.md`, `.claude/skills/archon/references/log-debugging.md` | Updated routing entry and detailed logging/debugging reference for Claude-facing users. |
| Bundled Codex Workflows & Commands `.archon/workflows/defaults/archon-assist-codex.yaml`, `archon-piv-loop-codex.yaml`, `archon-piv-loop-codex.README.md`, `.archon/commands/defaults/archon-assist-codex.md` | New Codex-tuned catch-all assist workflow, full PIV (Plan-Implement-Validate) human-in-the-loop workflow with promise gates and session management, and fallback assist command with structured debugging guidance. |
| Bundled Script Discovery `.archon/scripts/detect-project.ts`, `tsconfig.json` | New project-type detection script supporting Bun, Node, Python, Go, Rust, and Makefile projects with environment-specific validation/install commands. |
| Workflow Execution Core `packages/workflows/src/dag-executor.ts`, `script-discovery.ts`, `validator.ts`, `schemas/loop.ts` | Enhanced DAG executor with: Codex tuning options propagation, bundled script resolution/execution, loop progress tracking with stuck-iteration detection, script error metadata tracking. Script discovery now merges repo and bundled defaults; loop schema adds `progress_file` and `stuck_after_no_progress_iterations`. |
| Bundled Defaults Registry `packages/workflows/src/defaults/bundled-defaults.ts`, `bundled-defaults.test.ts`, `script-discovery.test.ts`, `loader.test.ts`, `validator.test.ts`, `dag-executor.test.ts` | Registered new Codex workflows/commands and `detect-project` script in bundled assets; extensive test coverage for script bundling, loop stuck detection, and provider-level Codex tuning. |
| CLI Assistant Selection `packages/cli/src/commands/continue.ts`, `continue.test.ts` | Dynamic workflow selection: `continueCommand` now chooses `archon-assist-codex` for Codex assistants and `archon-assist` for Claude, with fallback to config settings. |
| CLI & Workflow Foundation `packages/cli/src/commands/workflow.ts`, `workflow.test.ts`, `cli.ts`, `package.json` | Added pre-flight SQLite write-access guard for state-mutating workflow commands; updated help text and test references for new `continue` tests. |
| Database & Metadata `packages/core/src/db/workflows.ts`, `workflows.test.ts` | Updated `failWorkflowRun` to accept optional metadata object (for `node_counts`/`failed_nodes`) and `failOrphanedRuns` to exclude CLI-owned runs from auto-failure on restart. |
| Orchestrator & Prompt Building `packages/core/src/orchestrator/orchestrator-agent.ts`, `orchestrator-agent.test.ts`, `orchestrator.test.ts`, `prompt-builder.ts`, `prompt-builder.test.ts` | Added Telegram user-message persistence, assistant-type threading through prompt builders, new `getAssistWorkflowName()` helper selecting Codex/Claude assist workflow, and routing-rule interpolation with correct workflow names. |
| Path Utilities `packages/paths/src/archon-paths.ts`, `index.ts` | New `getDefaultScriptsPath()` export for accessing bundled scripts directory. |
| Web API & Documentation `packages/server/src/routes/api.health.test.ts`, `api.workflows.test.ts`, `packages/web/src/components/chat/ChatInterface.tsx`, `lib/workflow-metadata.test.ts` | API mocks updated to include `archon-assist-codex` bundled defaults; platform-specific reply hints for non-web conversations; workflow display name/category parsing for `-codex` suffix. |
| Public Docs `packages/docs-web/src/content/docs/book/essential-workflows.md`, `first-five-minutes.md`, `quick-reference.md`, `getting-started/overview.md`, `guides/authoring-workflows.md`, `guides/index.md`, `guides/loop-nodes.md`, `reference/assistant-architecture.md`, `reference/cli.md`, `reference/index.md`, `reference/troubleshooting.md` | Comprehensive documentation additions covering Codex workflow variants, loop progress/stuck settings, assistant architecture and provider selection, CLI name resolution with `-codex` suffix, and SQLite write-access requirements. |
| Design & Policy `docs/design/codex-first-workflow-surface-strategy.md`, `docs/prd/workflow-node-display-names.prd.md`, `CLAUDE.md` | Codex-first design philosophy doc establishing curated asset policy and capability crosswalk; node display-name PRD; fork upstream integration guidance. |
| Build Configuration `.gitignore`, `eslint.config.mjs` | Updated ignore patterns to allow `.agents/skills/` and `artifacts/` directory; extended ESLint type-checking to `.archon/scripts/`. |
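As an illustration of the project-type detection summarized in the Bundled Script Discovery row, a marker-file lookup could look like the sketch below. The six project types come from the walkthrough; the marker file names, their priority order, and the function itself are assumptions and are not taken from `.archon/scripts/detect-project.ts`.

```typescript
type ProjectType = "bun" | "node" | "python" | "go" | "rust" | "make" | "unknown";

// Assumed marker files, checked in priority order (Bun before generic Node,
// Makefile last because it often coexists with other toolchains).
const MARKERS: ReadonlyArray<readonly [string, ProjectType]> = [
  ["bun.lock", "bun"],
  ["bun.lockb", "bun"],
  ["package.json", "node"],
  ["pyproject.toml", "python"],
  ["go.mod", "go"],
  ["Cargo.toml", "rust"],
  ["Makefile", "make"],
];

// Classify a project from the file names found in its root directory.
function detectProjectType(rootFiles: string[]): ProjectType {
  const present = new Set(rootFiles);
  for (const [marker, type] of MARKERS) {
    if (present.has(marker)) return type;
  }
  return "unknown";
}
```

Taking a list of file names rather than reading the disk keeps the classification step pure, so it can be tested without a fixture directory.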

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

- feat: script node type for DAG workflows (bun/uv runtimes) #999: Adds bundled script support and modifies script-discovery, dag-executor, and validator code paths used throughout this diff for detecting/resolving named scripts.
- refactor: extract providers from @archon/core into @archon/providers #1137: Introduces provider abstraction and type plumbing that overlaps with this PR's Codex tuning-option propagation and provider-aware dag-executor logic.
- feat: Phase 2 (community-friendly provider registry system) #1195: Overlaps in prompt-builder and orchestrator changes for passing assistant type and adding Codex-specific workflow routing.

Poem

🐰 Scripts hop from bundled defaults, workflows dance in Codex lanes,
Loop promises gate the progress, stuck detection stops the pain.
From assist to piv to debug—each surface shines and true,
This fork finds its footing, Codex-first through and through! 🎯

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 59.68% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The description is incomplete relative to the template, missing critical sections like UX journey, architecture diagrams, and required validation/risk/compatibility details. | Complete the description by adding UX journey flows, architecture diagrams, module connections, label snapshot, validation evidence, security/compatibility/side-effect/rollback analysis, and verified scenarios beyond CI. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: implementing runtime support for workflow-level Codex tuning configuration, which is the primary objective of this PR. |

Wirasm · 2026-04-14T12:13:03Z

Thank you for contributing, but this PR is way too large to review. Please reopen it as smaller pieces of work.

matzls and others added 23 commits April 11, 2026 11:03

Merge branch 'feature/cross-platform-conversation-visibility' into dev

0b1eca9

Merge upstream/dev into preview/upstream-dev-merge-2026-04-12

492c40c

docs(workflow): add piv loop codex execution notes

183ecb6

docs(design): define Codex-first workflow surface strategy

7cf4e23

docs(archon): tighten Codex assist workflow guidance

84bf6ce

docs(archon): align codex workflow tuning docs

61b2d2e

Wirasm closed this Apr 14, 2026

matzls deleted the codex/archon-skill-parity branch April 14, 2026 13:05

matzls wants to merge 23 commits into coleam00:dev from matzls:codex/archon-skill-parity
